80 research outputs found

    Water sound recognition for indexing activities of daily living (original French title: Reconnaissance de sons d'eau pour l'indexation en activités de la vie quotidienne)

    National audience. The assessment of difficulties in performing activities of daily living is now used in the diagnosis of dementia, but it suffers from a lack of objective tools. To address this gap, the IMMED project proposes recording video at the patient's home and automatically indexing these videos by activity. The indexed videos let specialists watch patients carry out activities in their usual environment. In this context, many daily tasks involve water: washing hands, doing the dishes, brushing teeth, etc. In this article, we present two methods for detecting water sounds for automatic segmentation into activities. The first method, based on acoustic descriptors, detects water flow. To recognize other types of water sounds, such as drips, we also present an original approach that relies on acoustic models of liquid sounds.

    Water sound recognition based on physical models

    International audience. This article describes an audio signal processing algorithm to detect water sounds, built in the context of a larger system aiming to monitor the daily activities of elderly people. While previous proposals for water sound recognition relied on classical machine learning and generic audio features to characterize water sounds as a flow texture, we describe here a recognition system based on a physical model of air bubble acoustics. This system is able to recognize a wide variety of water sounds and does not require training. It is validated on a home environmental sound corpus with a classification task, in which all water sounds are correctly detected. In a free detection task on a real-life recording, it outperformed the classical systems and obtained an F-measure of 70%.

    The value of frequency tracking for detecting multiple harmonic sources (original French title: Intérêt du suivi de fréquences pour la détection de sources harmoniques multiples)

    National audience. In this article, we present a new approach for locating superimposed harmonic sources. Our method is based on tracking the predominant frequencies of the signal in order to form sinusoidal segments. The relations between the frequencies of these segments are then studied in order to group the sinusoidal segments belonging to the same source. Once the sources are located in the time-frequency plane, the regions where different sources coexist are finally extracted. Our approach was tested on both speech and music content, with promising results.

    Speaker verification using Large Margin GMM discriminative training

    International audience. Gaussian mixture models (GMM) have been widely and successfully used in speaker recognition during the last decades. They are generally trained using the generative criterion of maximum likelihood estimation. In an earlier work, we proposed an algorithm for discriminative training of GMM with diagonal covariances under a large margin criterion. In this paper, we present a new version of this algorithm which has the major advantage of being computationally highly efficient. The resulting algorithm is thus well suited to handle large scale databases. To show the effectiveness of the new algorithm, we carry out a full NIST speaker verification task using NIST-SRE'2006 data. The results show that our system outperforms the baseline GMM, and with high computational efficiency.
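
For readers unfamiliar with the baseline that these abstracts compare against, the following is a minimal sketch of how a diagonal-covariance GMM scores a sequence of feature frames. It illustrates only the generative scoring step, not the large margin training algorithm, whose details the abstract does not give. The function name `diag_gmm_log_likelihood`, the array shapes, and the toy numbers are all assumptions made for illustration.

```python
import numpy as np


def diag_gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    Hypothetical helper for illustration.
    frames: (n, d) feature vectors; weights: (k,) mixture weights;
    means, variances: (k, d) per-component parameters.
    """
    diff = frames[:, None, :] - means[None, :, :]                    # (n, k, d)
    # Log of the Gaussian normalisation constant for each component.
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (k,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)  # (n, k)
    weighted = np.log(weights) + log_comp
    # Log-sum-exp over components, shifted by the maximum for stability.
    peak = weighted.max(axis=1, keepdims=True)
    log_frame = peak[:, 0] + np.log(np.sum(np.exp(weighted - peak), axis=1))
    return float(np.mean(log_frame))


# Toy data (assumed): two frames close to the origin, two one-component models.
frames = np.array([[0.1, -0.2], [0.0, 0.3]])
weights = np.array([1.0])
unit_var = np.ones((1, 2))
score_a = diag_gmm_log_likelihood(frames, weights, np.array([[0.0, 0.0]]), unit_var)
score_b = diag_gmm_log_likelihood(frames, weights, np.array([[3.0, 3.0]]), unit_var)
```

In this toy setting the frames lie near the mean of the first model, so `score_a` is higher than `score_b`. A verification system would compare such a score (typically against a background model) with a threshold.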

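    Large margin discriminative training of GMM for automatic speaker verification (original French title: Apprentissage discriminant des GMM à grande marge pour la vérification automatique du locuteur)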

    National audience. Gaussian mixture models (GMM) have been widely and successfully used in speaker recognition during the last decades. They are generally trained using the generative criterion of maximum likelihood estimation. In an earlier work, we proposed an algorithm for discriminative training of GMM with diagonal covariances under a large margin criterion. In this paper, we present a new version of this algorithm which has the major advantage of being computationally highly efficient. The resulting algorithm is thus well suited to handle large scale databases. To show the effectiveness of the new algorithm, we carry out a full NIST speaker verification task using NIST-SRE'2006 data. The results show that our system outperforms the baseline GMM, and with high computational efficiency.

    Combination of SVM and Large Margin GMM modeling for speaker identification

    International audience. Most state-of-the-art speaker recognition systems are partially or completely based on Gaussian mixture models (GMM). GMM have been widely and successfully used in speaker recognition during the last decades. They are traditionally estimated from a world model using the generative criterion of Maximum A Posteriori. In an earlier work, we proposed an efficient algorithm for discriminative learning of GMM with diagonal covariances under a large margin criterion. In this paper, we evaluate the combination of the large margin GMM modeling approach with SVM in the setting of speaker identification. We carry out a full NIST speaker identification task using NIST-SRE'2006 data, in a Symmetrical Factor Analysis compensation scheme. The results show that the two modeling approaches are complementary and that their combination outperforms their single use.

    Large Margin GMM for discriminative speaker verification

    International audience. Gaussian mixture models (GMM), trained using the generative criterion of maximum likelihood estimation, have been the most popular approach in speaker recognition during the last decades. This approach is also widely used in many other classification tasks and applications. Generative learning is not, however, the optimal way to address classification problems. In this paper we first present a new algorithm for discriminative learning of diagonal GMM under a large margin criterion. This algorithm has the major advantage of being highly efficient, which allows fast discriminative GMM training using large scale databases. We then evaluate its performance on a full NIST speaker verification task using NIST-SRE'2006 data. In particular, we use the popular Symmetrical Factor Analysis (SFA) for session variability compensation. The results show that our system outperforms the state-of-the-art approaches of GMM-SFA and the SVM-based one, GSL-NAP. Relative reductions of the Equal Error Rate of about 9.33% and 14.88% are respectively achieved over these systems.

    Segmentation in singer turns with the Bayesian Information Criterion

    National audience. As part of a project on indexing ethnomusicological audio recordings, automatic segmentation into singer turns appeared to be essential. In this article, we present the problem of segmenting musical recordings into singer turns and our first experiments in this direction, exploring a method based on the Bayesian Information Criterion (BIC), which is used in numerous works on audio segmentation, to detect singer turns. The value of the BIC penalty coefficient that gives the best performance was found to vary from one recording to another. To avoid having to choose a single value for all the documents, we propose to combine several segmentations obtained with different values of this parameter. This method consists of taking a posteriori decisions on which segment boundaries are to be kept. A gain of 7.1% in terms of F-measure was obtained compared to a standard coefficient.
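
To make the role of the penalty coefficient concrete, here is a minimal sketch of the ΔBIC test commonly used in audio segmentation, assuming each segment is modelled by a single full-covariance Gaussian. It is a generic illustration rather than the authors' implementation, and it does not cover the a posteriori combination of several segmentations. The names `delta_bic` and `penalty_coef` and the synthetic "singer" data are assumptions.

```python
import numpy as np


def log_det_cov(frames):
    """Log-determinant of the maximum-likelihood covariance of the frames."""
    cov = np.atleast_2d(np.cov(frames, rowvar=False, bias=True))
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(frames, split, penalty_coef=1.0):
    """Delta-BIC for a candidate boundary at index `split` (hypothetical helper).

    A positive value favours two separate Gaussians (a boundary);
    a negative value favours a single Gaussian (no boundary).
    """
    n, d = frames.shape
    left, right = frames[:split], frames[split:]
    # Likelihood gain from modelling the two sides separately.
    gain = 0.5 * (n * log_det_cov(frames)
                  - len(left) * log_det_cov(left)
                  - len(right) * log_det_cov(right))
    # Extra parameters of the second Gaussian: mean plus symmetric covariance.
    n_params = d + d * (d + 1) / 2
    penalty = 0.5 * penalty_coef * n_params * np.log(n)
    return gain - penalty


# Synthetic 2-D features (assumed): two "singers" with different means.
rng = np.random.default_rng(0)
singer_a = rng.normal(0.0, 1.0, size=(200, 2))
singer_b = rng.normal(5.0, 1.0, size=(200, 2))
two_singers = np.vstack([singer_a, singer_b])
one_singer = rng.normal(0.0, 1.0, size=(400, 2))

score_change = delta_bic(two_singers, split=200)
score_no_change = delta_bic(one_singer, split=200)
```

Raising `penalty_coef` lowers every ΔBIC value, so fewer boundaries are accepted. This is why a single value tuned on one recording may over-segment or under-segment another.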

    Fast training of Large Margin diagonal Gaussian mixture models for speaker identification

    International audience. Gaussian mixture models (GMM) have been widely and successfully used in speaker recognition during the last decades. They are generally trained using the generative criterion of maximum likelihood estimation. In an earlier work, we proposed an algorithm for discriminative training of GMM with diagonal covariances under a large margin criterion. In this paper, we present a new version of this algorithm which has the major advantage of being computationally highly efficient. The resulting algorithm is thus well suited to handle large scale databases. We carry out experiments on a speaker identification task using NIST-SRE'2006 data and compare our new algorithm to the baseline generative GMM using different GMM sizes. The results show that our system significantly outperforms the baseline GMM in all configurations, and with high computational efficiency.

    Speaker Identification Using Discriminative Learning of Large Margin GMM

    International audience. Gaussian mixture models (GMM) have been widely and successfully used in speaker recognition during the last decades. They are generally trained using the generative criterion of maximum likelihood estimation. In an earlier work, we proposed an algorithm for discriminative training of GMM with diagonal covariances under a large margin criterion. In this paper, we present a new version of this algorithm which has the major advantage of being computationally highly efficient, thus well suited to handle large scale databases. We evaluate our fast algorithm in a Symmetrical Factor Analysis compensation scheme. We carry out a full NIST speaker identification task using NIST-SRE'2006 data. The results show that our system outperforms the traditional discriminative approach of SVM-GMM supervectors. A 3.5% speaker identification rate improvement is achieved.